Incremental Extract Rules

An incremental extract rule can be specified to maintain the size of the extract file. A rule defines the extract checkpoints to keep. After each incremental extract the rule is applied to remove any extract checkpoints that fall outside the definition of the rule.

FastStats Designer will trim the input file so that checkpoints that fit the rule are always included. If no suitable checkpoint can be found then no file trim will occur. The number of records trimmed will be reported in the build log.

Deleting the extract file will cause a full extract to take place and reset all checkpoints.

Several rule types can be defined:

Limit by extract age (hours / days / months / years)

The cut-off age of an extract checkpoint can be defined in hours, days, months or years. Checkpoints older than the cut-off are removed from the start of the extract file.

This creates a sliding window of extracts that holds only recent changes, which limits the extract file size.
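
The age rule can be illustrated with a minimal sketch. This is not FastStats Designer code: the Checkpoint class, its fields and the trim_by_age function are hypothetical names used only to model the behaviour described above. It assumes checkpoints are held oldest first, and it reads "no suitable checkpoint" as "no checkpoint inside the age window".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Checkpoint:
    taken_at: datetime  # hypothetical: when the incremental extract ran
    records: int        # hypothetical: records written by that extract


def trim_by_age(checkpoints, max_age, now):
    """Drop checkpoints older than max_age from the start of the list.

    Returns (kept_checkpoints, records_trimmed). Checkpoints are assumed
    to be ordered oldest first.
    """
    cutoff = now - max_age
    # Index of the oldest checkpoint that is still inside the window.
    keep_from = next(
        (i for i, c in enumerate(checkpoints) if c.taken_at >= cutoff), None
    )
    if keep_from is None:
        # Assumption: no checkpoint fits the rule, so nothing is trimmed.
        return list(checkpoints), 0
    trimmed = sum(c.records for c in checkpoints[:keep_from])
    return checkpoints[keep_from:], trimmed


history = [
    Checkpoint(datetime(2024, 1, 1), 100),
    Checkpoint(datetime(2024, 1, 10), 40),
    Checkpoint(datetime(2024, 1, 20), 25),
]
kept, trimmed = trim_by_age(history, timedelta(days=14), now=datetime(2024, 1, 21))
```

With a 14 day limit, only the 1 January checkpoint falls outside the window, so the sketch keeps the two later checkpoints and reports 100 records trimmed.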

Limit by number of records

The cut-off point for extract checkpoints is defined by the total number of records in the extract file. Checkpoints are removed from the start of the extract file.

This creates a sliding window of extracts that holds only recent changes, which limits the total extract file size.
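
As a rough sketch of the record-count rule, the function below drops the oldest checkpoints until the remaining total fits the limit. The function name and the list-of-counts representation are hypothetical, and two points are assumptions rather than documented behaviour: the limit is treated as an upper bound on the total, and the file is left untouched when even the newest checkpoint alone exceeds it.

```python
def trim_by_record_count(records_per_checkpoint, max_records):
    """Drop the oldest checkpoints until the total is within max_records.

    records_per_checkpoint lists the record count of each checkpoint,
    oldest first. Returns (kept_counts, records_trimmed).
    """
    total = sum(records_per_checkpoint)
    for drop in range(len(records_per_checkpoint)):
        if total <= max_records:
            # Dropping the first `drop` checkpoints is enough.
            trimmed = sum(records_per_checkpoint[:drop])
            return records_per_checkpoint[drop:], trimmed
        total -= records_per_checkpoint[drop]
    # Assumption: no checkpoint satisfies the rule, so nothing is trimmed.
    return list(records_per_checkpoint), 0


kept, trimmed = trim_by_record_count([500, 300, 200, 100], max_records=700)
```

Here the oldest checkpoint (500 records) is removed, leaving 600 records across the three most recent checkpoints.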

Discard recent Extracts (hours or days)

Keeps older records in the extract file and discards recent extracts. This causes recent data to be re-extracted on every build. This is useful if recent data is volatile and subject to change while older data is static.

Example: Discard recent Extracts (14 days). This allows checkpoints older than 14 days to persist in the extract file but more recent checkpoints will be trimmed from the end of the file. This is useful if the data extracted in the last 14 days is subject to change, or more information gets added over the 14 days.
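
The 14 day example can be sketched as follows. The function name and the use of plain timestamps to stand for checkpoints are hypothetical, and the treatment of a checkpoint exactly 14 days old (discarded here) is an assumption.

```python
from datetime import datetime, timedelta


def discard_recent(checkpoint_times, window, now):
    """Split checkpoints into those kept and those trimmed from the end.

    checkpoint_times is ordered oldest first. Checkpoints strictly older
    than `window` are kept; the rest are discarded and re-extracted.
    """
    cutoff = now - window
    kept = [t for t in checkpoint_times if t < cutoff]
    discarded = checkpoint_times[len(kept):]
    return kept, discarded


times = [
    datetime(2024, 1, 1),
    datetime(2024, 1, 10),
    datetime(2024, 1, 25),
    datetime(2024, 1, 30),
]
kept, discarded = discard_recent(times, timedelta(days=14), now=datetime(2024, 2, 1))
```

The two January checkpoints taken before 18 January persist, while the two taken in the last 14 days are trimmed and will be extracted again on the next build.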

Discard all extracts with ERN > X

Keeps older records in the extract file and discards all extracts after a user-specified ERN. This causes recent data to be re-extracted on every build. This is useful if recent data is volatile and subject to change while older data is static.

Example: Discard all extracts with ERN > 10000 will trim the extract file back to the most recent checkpoint where the ERN is <= 10000. This allows you to perform some initial builds with a static set of data, after which all subsequent extracts contain volatile data.

Discard all but first X extract(s)

Keeps older records in the extract file and discards all extracts after a user-specified checkpoint. This causes recent data to be re-extracted on every build. This is useful if recent data is volatile and subject to change while older data is static.

Example: Discard all but first 1 extract(s) will trim the extract file back to the first checkpoint on every subsequent build. This allows you to perform an initial build with a static set of data, after which all subsequent extracts contain volatile data.
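
The ERN rule and the first X rule can both be sketched as a trim from the end of the checkpoint list. The function names and the representation of each checkpoint by a single ERN value are hypothetical. Leaving the file untouched when no checkpoint has an ERN at or below the limit is an interpretation of the "no suitable checkpoint" behaviour, not confirmed behaviour.

```python
def trim_to_ern(checkpoint_erns, max_ern):
    """Keep checkpoints up to the most recent one with ERN <= max_ern.

    checkpoint_erns holds the ERN recorded at each checkpoint, in
    ascending order.
    """
    kept = [ern for ern in checkpoint_erns if ern <= max_ern]
    if not kept:
        # Assumption: no suitable checkpoint, so nothing is trimmed.
        return list(checkpoint_erns)
    return kept


def keep_first(checkpoints, count):
    """Keep only the first `count` checkpoints and discard the rest."""
    return checkpoints[:count]


erns = [4000, 9500, 12000, 15000]
by_ern = trim_to_ern(erns, max_ern=10000)
by_count = keep_first(erns, count=1)
```

With these values the ERN rule trims back to the checkpoint at ERN 9500, and the first 1 rule trims back to the checkpoint at ERN 4000.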